EN FR
EN FR


Section: New Results

Model checking within SimGrid

Participants : Stephan Merz, Martin Quinson [of project team AlGorille] , Cristián Rosa.

For several years we have cooperated with Martin Quinson from the AlGorille project team on adding model checking capabilities to the simulation platform SimGrid for message-passing distributed C programs. The expected benefit of such an integration is that programmers can complement simulation runs by exhaustive state space exploration in order to detect errors such as race conditions that would be hard to reproduce by testing. Indeed, a simulation platform provides a controlled execution environment that mediates interactions between processes, and between processes and the environment, and thus provides the basic functionality for implementing a model checker. The principal challenge is the state explosion problem, as a naive approach to the systematic generation of all possible process interleavings would be infeasible beyond the most trivial programs. Moreover, it is impractical to store the set of global system states that have already been visited: the programs under analysis are arbitrary C programs with full access to the heap, making it difficult and costly to store global states and to determine if two states are equal.

We have implemented a stateless model checker within the SimGrid platform, for verifying safety properties of distributed C programs that communicate by message passing. The visible actions correspond to the communication events, at which points programs can be interrupted by the simulation core. In order to mitigate state explosion, the exploration relies on Dynamic Partial-Order Reduction (DPOR) that avoids exploring redundant interleavings corresponding to the same global happens-before relation. We have identified four primitive communication actions, in terms of which the different message-passing libraries provided by SimGrid can be implemented, and have proved independence theorems for these primitives that underly our DPOR exploration algorithm. We thus obtain a small kernel that supports different communication APIs; nevertheless, practical evaluations yield similar reductions as those obtained by Li et al. [30] for a much more detailed analysis of a fragment of the MPI library.

The model checker SimGridMC is now part of the SimGrid platform and allows programmers to either perform simulation or model checking runs based on the same source code. It has allowed us to discover a non-trivial bug in an implementation of the Chord algorithm for realizing a distributed hashtable over a P2P network. A conference paper has been published at FORTE 2011 [13] . Cristián Rosa successfully defended his PhD thesis [7] in October 2011, which also proposes efficient techniques for parallelizing simulation runs in SimGrid. Marion Guthmuller has explored extensions of our model checking algorithm for verifying liveness properties, and has started working on her PhD thesis in this area in the fall of 2011.